FCC: Modeling Probabilities with GIZA++ for Task 2 and 3 of SemEval-2

Authors

  • Darnes Vilariño Ayala
  • Carlos Balderas Posada
  • David Eduardo Pinto Avendaño
  • Miguel Rodríguez Hernández
  • Saúl León
Abstract

In this paper we present a naïve approach to the problems of cross-lingual WSD and cross-lingual lexical substitution, which correspond to Tasks 2 and 3 of the SemEval-2 competition. We used a bilingual statistical dictionary, computed with GIZA++ from the EUROPARL parallel corpus, to calculate the probability that a source word is translated to a given target word (which is assumed to be the correct sense of the source word, expressed in a different language). Two versions of the probabilistic model are tested: unweighted and weighted. The results show that the unweighted version performs better than the weighted one.
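
The dictionary lookup described in the abstract can be illustrated with a minimal sketch. It assumes the bilingual statistical dictionary has already been loaded into memory as a mapping from a source word to its candidate target words and their translation probabilities. The names `TOY_DICTIONARY`, `rank_translations` and `best_translation`, as well as the probability values, are hypothetical and are not taken from the paper or from GIZA++. The abstract does not say how the unweighted and weighted variants combine these probabilities, so the sketch stops at ranking candidates for a single source word.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy dictionary: source word -> {target word: P(target | source)}.
# The values are invented for illustration; a real table would be estimated
# by a word-alignment tool from a parallel corpus.
TOY_DICTIONARY: Dict[str, Dict[str, float]] = {
    "bank": {"banco": 0.7, "orilla": 0.2, "ribera": 0.1},
    "letter": {"carta": 0.6, "letra": 0.4},
}


def rank_translations(
    source_word: str,
    dictionary: Dict[str, Dict[str, float]],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to top_n target words, most probable first."""
    candidates = dictionary.get(source_word, {})
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


def best_translation(
    source_word: str,
    dictionary: Dict[str, Dict[str, float]],
) -> Optional[str]:
    """Return the most probable target word, or None for an unknown source word."""
    ranked = rank_translations(source_word, dictionary, top_n=1)
    return ranked[0][0] if ranked else None
```

Under these assumptions, `best_translation("bank", TOY_DICTIONARY)` selects the single most probable candidate, while `rank_translations` returns a short ranked list of candidates.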

Similar resources

Idiom Savant at Semeval-2017 Task 7: Detection and Interpretation of English Puns

This paper describes our system, entitled Idiom Savant, for the 7th Task of the Semeval 2017 workshop, “Detection and interpretation of English Puns”. Our system consists of two probabilistic models for each type of puns using Google n-grams and Word2Vec. Our system achieved fscore of 0.84, 0.663, and 0.07 in homographic puns and 0.8439, 0.6631, and 0.0806 in heterographic puns in task 1, task ...

ICL00 at SemEval-2016 Task 3: Translation-Based Method for CQA System

We participate in the English subtask B and C at SemEval-2016 Task 3 “Community Question Answering”. This paper is concerned with the description of our participating system. We propose a ranking model that combines a translation model with the cosine similarity-based method. Compared to the traditional bag of words method, the proposed model is more effective because the relationships between ...

COLEPL and COLSLM: An Unsupervised WSD Approach to Multilingual Lexical Substitution, Tasks 2 and 3 SemEval 2010

In this paper, we present a word sense disambiguation (WSD) based system for multilingual lexical substitution. Our method depends on having a WSD system for English and an automatic word alignment method. Crucially the approach relies on having parallel corpora. For Task 2 (Sinha et al., 2009) we apply a supervised WSD system to derive the English word senses. For Task 3 (Lefever & Hoste, 2009...

FCC: Three Approaches for Semantic Textual Similarity

In this paper we describe the three approaches we submitted to the Semantic Textual Similarity task of SemEval 2012. The first approach calculates semantic similarity using the Jaccard coefficient with term expansion based on synonyms. The second approach uses the semantic similarity measure reported in (Mihalcea et al., 2006). The third approach employs Random Indexing and ...

SemEval-2015 Task 15: A CPA dictionary-entry-building task

This paper describes the first SemEval task to explore the use of Natural Language Processing systems for building dictionary entries, in the framework of Corpus Pattern Analysis. CPA is a corpus-driven technique which provides tools and resources to identify and represent unambiguously the main semantic patterns in which words are used. Task 15 draws on the Pattern Dictionary of English Verbs ...

Publication date: 2010